
    T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5

    In spoken language understanding (SLU), a natural solution is to concatenate pre-trained speech models (e.g., HuBERT) and pre-trained language models (PLMs, e.g., T5). Most previous works use pre-trained language models with subword-based tokenization. However, the granularity of the input units affects the alignment between speech model outputs and language model inputs, and PLMs with character-based tokenization are underexplored. In this work, we conduct extensive studies on how PLMs with different tokenization strategies affect spoken language understanding tasks, including spoken question answering (SQA) and speech translation (ST). We further extend the idea to create T5lephone (pronounced as "telephone"), a variant of T5 that is pretrained on phonemicized text. We initialize T5lephone with existing PLMs so that it can be pretrained with relatively lightweight computational resources. We reach state-of-the-art results on NMSQA, and the T5lephone model exceeds T5 with other types of units on end-to-end SQA and ST.
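The phoneme-level pretraining idea can be illustrated with a toy grapheme-to-phoneme step. The mini-lexicon and function below are hypothetical stand-ins, a minimal sketch only; the abstract does not specify the phonemizer actually used:

```python
# Toy sketch of phonemicizing text before feeding it to a T5-style model.
# TOY_LEXICON is a made-up three-word lexicon; a real pipeline would use a
# full grapheme-to-phoneme converter over the whole pretraining corpus.
TOY_LEXICON = {
    "the": "DH AH",
    "cat": "K AE T",
    "sat": "S AE T",
}

def phonemicize(sentence: str) -> str:
    """Map each word to its phoneme string; unknown words pass through."""
    return " ".join(TOY_LEXICON.get(w, w) for w in sentence.lower().split())

print(phonemicize("The cat sat"))  # -> DH AH K AE T S AE T
```

The pretraining objective itself is unchanged; only the input text is rewritten into phoneme units so that its granularity better matches speech-model outputs.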

    Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning

    In this work, we propose a method to create domain-sensitive speech recognition models that utilize textual domain information by conditioning generation on a given text prompt. This is accomplished by fine-tuning a pre-trained, end-to-end model (Whisper) to learn from demonstrations with prompt examples. We show that this ability generalizes to different domains and even various prompt contexts, with our model gaining a word error rate (WER) reduction of up to 33% on unseen datasets from various domains, such as medical conversations, air traffic control communication, and financial meetings. Considering the limited availability of audio-transcript pair data, we further extend our method to text-only fine-tuning to achieve domain sensitivity as well as domain adaptation. We demonstrate that our text-only fine-tuned model can also attend to various prompt contexts, reaching a WER reduction of up to 29% on the medical conversation dataset. (Comment: F-T Liao and Y-C Chan contributed equally.)
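The relative WER reductions quoted above can be made concrete with a minimal word error rate computation (pure-Python edit distance; the example WER values are illustrative, not from the paper):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[len(r)][len(h)] / len(r)

# A relative "WER reduction of 33%" means the prompted model's WER is one
# third lower than the baseline's, e.g. 0.30 -> 0.20 (values made up here).
baseline, prompted = 0.30, 0.20
print(f"{(baseline - prompted) / baseline:.0%}")  # prints 33%
```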

    Hemispheric dispersion of radioactive plume laced with fission nuclides from the Fukushima nuclear event

    Radioactivities of particulate 131I and 137Cs released from the Fukushima nuclear accident were monitored in a regional aerosol network including two high-mountain sites (central Taiwan and the Tibetan Plateau). The results were integrated with data measured elsewhere around the world, with special focus on the mid-latitudes. The hemispheric transport of the Fukushima radiation clouds (FRCs) by the westerlies took 18 days, displaying an exponential-like decrease eastward, with a dilution factor of at least five orders of magnitude following a full circuit around the globe. The initial two waves of FRCs may have traveled at different altitudes: the first one at 3–4 km, whereas the second one at up to 5 km or more. 131I and 137Cs were fractionated during transport, with 137Cs concentrated in the shallower layer, susceptible to depositional removal, while 131I moved faster and higher. This accident may serve as an example for identifying some atmospheric processes on the hemispheric scale.
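The "five orders of magnitude over one circuit" claim can be turned into an e-folding distance under a simple exponential-decay model. The circuit length below is an approximate mid-latitude circumference, assumed for illustration; it is not a figure from the study:

```python
import math

# Exponential-like eastward decrease: C(x) = C0 * exp(-x / lam).
# A dilution factor of 1e5 after one full mid-latitude circuit
# (roughly 3.3e4 km at ~40 deg N; assumed, not from the abstract) gives:
circuit_km = 3.3e4
dilution = 1e5
lam = circuit_km / math.log(dilution)  # e-folding distance in km
print(f"e-folding distance ~ {lam:.0f} km")
```

Under these assumptions the plume concentration drops by a factor of e every few thousand kilometres of eastward travel.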

    Bridging Speech and Textual Pre-trained Models with Unsupervised ASR

    Spoken language understanding (SLU) is a task aiming to extract high-level semantics from spoken utterances. Previous works have investigated the use of speech self-supervised models and textual pre-trained models, which have shown reasonable improvements on various SLU tasks. However, because of the mismatched modalities between speech signals and text tokens, previous methods usually require complex framework designs. This work proposes a simple yet efficient unsupervised paradigm that connects speech and textual pre-trained models, resulting in an unsupervised speech-to-semantic pre-trained model for various tasks in SLU. Specifically, we propose to use unsupervised automatic speech recognition (ASR) as a connector that bridges the different modalities used in speech and textual pre-trained models. Our experiments show that unsupervised ASR itself can improve the representations from speech self-supervised models. More importantly, it is shown to be an efficient connector between speech and textual pre-trained models, improving performance on five different SLU tasks. Notably, on spoken question answering, we reach the state-of-the-art result on the challenging NMSQA benchmark. (Comment: ICASSP 2023 submission.)

    Extending the Pre-Training of BLOOM for Improved Support of Traditional Chinese: Models, Methods and Results

    In this paper, we present the multilingual language model BLOOM-zh, which features enhanced support for Traditional Chinese. BLOOM-zh has its origins in the open-source BLOOM models presented by BigScience in 2022. Starting from the released models, we extended the pre-training of BLOOM by an additional 7.4 billion tokens in Traditional Chinese and English, covering a variety of domains such as news articles, books, encyclopedias, and educational materials, as well as spoken language. To show the properties of BLOOM-zh, both existing and newly created benchmark scenarios are used to evaluate its performance. BLOOM-zh outperforms its predecessor on most Traditional Chinese benchmarks while maintaining its English capability. We release all our models to the research community.

    A research roadmap for quantifying non-state and subnational climate mitigation action

    Non-state and subnational climate actors have become central to global climate change governance. Quantitatively assessing climate mitigation undertaken by these entities is critical to understanding the credibility of this trend. In this Perspective, we make recommendations regarding five main areas of research and methodological development related to evaluating non-state and subnational climate actions: defining clear boundaries and terminology; use of common methodologies to aggregate and assess non-state and subnational contributions; systematically dealing with issues of overlap; estimating the likelihood of implementation; and addressing data gaps.

    Initial Visible and Mid-IR Characterization of P/2019 LD2 (ATLAS), an Active Transitioning Centaur Among the Trojans, with Hubble, Spitzer, ZTF, Keck, APO and GROWTH Imaging and Spectroscopy

    We present visible and mid-infrared imagery and photometry of Jovian co-orbital comet P/2019 LD2 (ATLAS) taken with the Hubble Space Telescope/WFC3 on 2020 April 1, the Spitzer Space Telescope/IRAC on 2020 January 25, the Zwicky Transient Facility between 2019 April 9 and 2019 Nov 8, and the GROWTH telescope network from 2020 May to July, as well as visible spectroscopy from Keck/LRIS on 2020 August 19. Our observations indicate that LD2 has a nucleus with radius 0.2-1.8 km assuming a 0.08 albedo, and that the coma is dominated by ∼100 μm-scale dust ejected at ∼1 m/s speeds with a ∼1″ jet pointing in the SW direction. LD2 experienced a total dust mass loss of ∼10⁸ kg and a dust mass loss rate of ∼6 kg/s, with Afρ/cross-section varying between ∼85 cm/125 km² and ∼200 cm/310 km² between 2019 April 9 and 2019 Nov 8. If the Afρ/cross-section increase remained constant, it implies that LD2 has remained active since ∼2018 November, when it came within 4.8 au of the Sun, a typical distance for comets to begin sublimation of H₂O. From our 4.5 μm Spitzer observations, we set a limit on CO/CO₂ gas production of ∼10²⁷/∼10²⁶ mol/s. Multiple-bandpass photometry of LD2 taken by the GROWTH network, measured in a 10,000 km aperture, provides color measurements of g-r = 0.59±0.03, r-i = 0.18±0.05, and i-z = 0.01±0.07, colors typical of comets. We set a spectroscopic upper limit on the production of H₂O gas of ∼80 kg/s. Improving the orbital solution for LD2 with our observations, we determine that the long-term orbit of LD2 is that of a typical Jupiter Family Comet, having close encounters with Jupiter coming within ∼0.5 Hill radius in the last ∼3 y and to within 0.8 Hill radius in ∼9 y, and a 95% chance of being ejected from the Solar System in <10 Myr.
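The quoted total dust mass and mass-loss rate are mutually consistent at the order-of-magnitude level, as a quick check shows. The activity duration below is inferred from the abstract's ∼2018 November onset and mid-2020 observations, so it is approximate:

```python
# Order-of-magnitude consistency check between the ~6 kg/s dust mass-loss
# rate and the ~1e8 kg total, assuming activity since ~2018 November
# (roughly 1.5 years up to the mid-2020 observations; approximate).
seconds = 1.5 * 365.25 * 86400  # ~4.7e7 s of activity
rate = 6.0                      # kg/s, from the abstract
total = rate * seconds
print(f"total ~ {total:.1e} kg")  # same order as the quoted ~1e8 kg
```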

    Pathogenic LRRK2 Mutations Do Not Alter Gene Expression in Cell Model Systems or Human Brain Tissue

    Point mutations in LRRK2 cause autosomal dominant Parkinson's disease. Despite extensive efforts to determine the mechanism of cell death in patients with LRRK2 mutations, the aetiology of LRRK2 PD is not well understood. To examine possible alterations in gene expression linked to the presence of LRRK2 mutations, we carried out a case versus control analysis of global gene expression in three systems: fibroblasts isolated from LRRK2 mutation carriers and healthy, non-mutation-carrying controls; brain tissue from G2019S mutation carriers and controls; and HEK293 inducible LRRK2 wild-type and mutant cell lines. No significant alteration in gene expression was found in these systems following correction for multiple testing. These data suggest that any alterations in basal gene expression in fibroblasts or cell lines containing mutations in LRRK2 are likely to be quantitatively small. This work suggests that LRRK2 is unlikely to play a direct role in modulation of gene expression, although it remains possible that this protein can influence mRNA expression under pathogenic circumstances.
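"Correction for multiple testing" in genome-wide expression comparisons like this is commonly done with a false-discovery-rate procedure such as Benjamini-Hochberg; the abstract does not name the method actually used, so the sketch below is illustrative only:

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Return indices of hypotheses rejected under BH FDR control at alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending p-values
    k = 0  # largest rank whose p-value clears its BH threshold rank*alpha/m
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k = rank
    return sorted(order[:k])

# With many genes tested, a modest uncorrected p-value (0.03, 0.04) can fail
# correction; here only the very small one survives.
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.8]))  # -> [0]
```

This is why a study can observe nominally small p-values yet still report "no significant alteration ... following correction for multiple testing."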